A LOWER BOUND FOR THE MIXING TIME OF THE 
RANDOM-TO-RANDOM INSERTIONS SHUFFLE 



By Eliran Subag* 
Technion

The best known lower and upper bounds on the mixing time for
the random-to-random insertions shuffle are $(1/2 - o(1))\,n\log n$ and
$(2 + o(1))\,n\log n$. A long standing open problem is to prove that the
mixing time exhibits a cutoff. In particular, Diaconis (2003) conjectured
that the cutoff occurs at $\frac{3}{4} n\log n$. Our main result is a lower
bound of $t_n = (3/4 - o(1))\,n\log n$, corresponding to this conjecture.

Our method is based on analysis of the positions of cards yet-to-be-removed.
We show that for large $n$ and $t_n$ as above, with high
probability the number of cards within a certain distance from their 
initial position is the same under the measure induced by the shuffle 
and under the stationary measure, up to a lower order term. However, 
under the induced measure, this lower order term is dominated by the 
cards yet-to-be-removed, and is of higher order than for the stationary 
measure. 



1. Introduction. In the random-to-random insertions shuffle a card
is chosen at random, removed from the deck and reinserted in a random
position. Assuming the cards are numbered from 1 to $n$, let us identify an
ordered deck with the permutation $\sigma \in S_n$ such that $\sigma(j)$ is the position
of the card numbered $j$. The shuffling process induces a random walk $\Pi_t$,
$t = 0, 1, \ldots$, on $S_n$. Let $P^n_\sigma$ be the probability measure corresponding to the
random walk starting from $\sigma \in S_n$.
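
As an illustration of the shuffle just described, here is a minimal sketch of a single step and of the induced position map. The function names `random_to_random_step` and `positions` are hypothetical, and the deck is assumed to be stored as a list of card labels from position 1 to position n.

```python
import random

def random_to_random_step(deck, rng):
    """Remove a uniformly chosen card and reinsert it at a uniform position.

    `deck` lists card labels from position 1 to position n.
    """
    n = len(deck)
    new_deck = list(deck)
    card = new_deck.pop(rng.randrange(n))      # remove a uniformly chosen card
    new_deck.insert(rng.randrange(n), card)    # n possible slots after removal
    return new_deck

def positions(deck):
    """Return sigma as a dict: sigma[j] = position (1-based) of card j."""
    return {card: pos for pos, card in enumerate(deck, start=1)}

rng = random.Random(0)
deck = list(range(1, 11))          # identity: card j sits at position j
for _ in range(50):
    deck = random_to_random_step(deck, rng)
sigma = positions(deck)
```

Each step uses one of the n^2 equally likely (card, slot) pairs, so several pairs may produce the same deck (for instance, every card reinserted in its own slot gives the identity move).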

Clearly, $\Pi_t$ is an irreducible and aperiodic Markov chain. Therefore
$P^n_\sigma(\Pi_t \in \cdot)$ converges, as $t \to \infty$, to the stationary measure $U^n$, the uniform
measure on $S_n$. To quantify the distance from stationarity, one usually uses
the total variation distance

$$d_n(t) \triangleq \max_{\sigma \in S_n} \left\| P^n_\sigma(\Pi_t \in \cdot) - U^n \right\|_{TV} = \left\| P^n_{\mathrm{id}}(\Pi_t \in \cdot) - U^n \right\|_{TV},$$

where equality follows since the chain is transitive. The mixing time is then
defined by

$$t^{(n)}_{\mathrm{mix}}(\epsilon) \triangleq \min\{t : d_n(t) \le \epsilon\}.$$
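
For very small decks the total variation distance and the mixing time can be computed exactly, which may help make the definitions concrete. The sketch below is an illustration only: it assumes the walk starts at the identity (justified by transitivity), represents a deck as a tuple of card labels, and uses hypothetical helper names.

```python
from itertools import permutations
from fractions import Fraction

def step_distribution(dist, n):
    """Push a distribution over decks (tuples) through one shuffle."""
    out = {}
    w = Fraction(1, n * n)               # each (card, slot) pair has weight 1/n^2
    for deck, p in dist.items():
        for i in range(n):               # index of the removed card
            rest = deck[:i] + deck[i + 1:]
            for k in range(n):           # slot for reinsertion
                new = rest[:k] + (deck[i],) + rest[k:]
                out[new] = out.get(new, 0) + p * w
    return out

def tv_from_uniform(dist, n):
    """Total variation distance between `dist` and the uniform measure on S_n."""
    states = list(permutations(range(1, n + 1)))
    u = Fraction(1, len(states))
    return sum(abs(dist.get(s, 0) - u) for s in states) / 2

def tv_profile(n, t_max):
    """Return [d_n(0), ..., d_n(t_max)] for the walk started at the identity."""
    dist = {tuple(range(1, n + 1)): Fraction(1)}
    profile = [tv_from_uniform(dist, n)]
    for _ in range(t_max):
        dist = step_distribution(dist, n)
        profile.append(tv_from_uniform(dist, n))
    return profile

def mixing_time(n, eps, t_max=200):
    """Smallest t with d_n(t) <= eps, or None if not reached by t_max."""
    for t, d in enumerate(tv_profile(n, t_max)):
        if d <= eps:
            return t
    return None

profile = tv_profile(4, 12)
```

The enumeration has n! states, so this is only feasible for decks of a handful of cards; it says nothing about the asymptotic regime studied here.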

In order to study the rate of convergence to stationarity for large $n$, one
studies how the mixing time grows as $n \to \infty$. In particular, one is interested
in finding conditions on $t_n$ such that $\lim_{n\to\infty} d_n(t_n)$ equals 0 or 1.

The random-to-random insertions shuffle is known to have a pre-cutoff of
order $O(n\log n)$. Namely, for $c_1 = \frac12$, $c_2 = 2$:

(i) for any sequence of the form $t_n = c_1 n\log n - k_n n$ with $\lim_{n\to\infty} k_n = \infty$,
$\lim_{n\to\infty} d_n(t_n) = 1$; and

(ii) for any sequence of the form $t_n = c_2 n\log n + k_n n$ with $\lim_{n\to\infty} k_n = \infty$,
$\lim_{n\to\infty} d_n(t_n) = 0$.

In Diaconis and Saloff-Coste (1993) the mixing time is shown to be of order
$O(n\log n)$. Uyemura-Reyes (2002) uses a comparison technique from
Diaconis and Saloff-Coste (1993) to show that the upper bound above holds
with $c_2 = 4$ and proves the lower bound with $c_1 = \frac12$ by studying the longest
increasing subsequence. In Saloff-Coste and Zúñiga (2008) the upper bound
is improved, also by applying a comparison technique, and shown to hold
with $c_2 = 2$. An alternative proof of the lower bound with $c_1 = \frac12$ is also
given there.

A long standing open problem is to prove the existence of a cutoff in total
variation (see Diaconis and Saloff-Coste (1995); Diaconis (2003)); that is, a
value $c$ such that for any $\epsilon > 0$:

(i) for any sequence $t_n \le (c - \epsilon)\, n\log n$, $\lim_{n\to\infty} d_n(t_n) = 1$; and

(ii) for any sequence $t_n \ge (c + \epsilon)\, n\log n$, $\lim_{n\to\infty} d_n(t_n) = 0$.

In particular, in Diaconis (2003) it is conjectured that there is a cutoff at
$\frac34 n\log n$.

Our main result is a lower bound on the mixing time with this rate. 

Theorem 1.1. Let $t_n = \frac34 n\log n - \frac54 n\log\log n - c_n n$ with $\lim_{n\to\infty} c_n = \infty$.
Then $\lim_{n\to\infty} d_n(t_n) = 1$.

The proof is based on analysis of the distribution of the positions of cards
yet-to-be-removed. Let $[n] = \{1,\ldots,n\}$ and denote the set of cards that
have not been chosen for removal and reinsertion up to time $t$ by $A^t = A^t_n$.
The following result describes the limiting distribution for a card in $A^t$ as
the size of the deck grows (in the sense below).

Let $\Rightarrow$ denote weak convergence and $N(0,1)$ denote the standard normal
distribution.
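
Theorem 1.2. Let $j_n \in [n]$ and $t_n \in \mathbb{N}$ be sequences. Assume that
$\gamma = \lim_{n\to\infty} j_n/n$ exists, and that

$$\lim_{n\to\infty} \frac{n^2}{t_n j_n (n - j_n)} = \lim_{n\to\infty} \frac{t_n}{j_n (n - j_n)} = 0.$$

Then

$$P^n_{\mathrm{id}}\left( \frac{\Pi_{t_n}(j_n) - j_n}{\sqrt{2 t_n \lambda_n}} \in \cdot \;\Big|\; j_n \in A^{t_n} \right) \Rightarrow P\left(N(0,1) \in \cdot\right),$$

where

$$\lambda_n = \begin{cases} j_n/n & \text{if } \gamma = 0, \\ \gamma(1-\gamma) & \text{if } \gamma \in (0,1), \\ (n - j_n)/n & \text{if } \gamma = 1. \end{cases}$$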



This can be explained by the following heuristic. For $i \in [n]$ not too close
to 1 or $n$, for $m < t$, the conditional transition probabilities (given in (2.2)
below)

$$P^n_{\mathrm{id}}\left(\Pi_{m+1}(j) = i + k \mid \Pi_m(j) = i,\ j \in A^t\right),$$

are close to symmetric. Thus, under mild conditions on $t$ and $j$, conditioned
on $j \in A^t$, we expect $\max_{0\le m\le t} |\Pi_m(j) - j|$ to be small. On a sufficiently
small neighborhood of $j$ the transition probabilities above (which depend on
$i$) hardly vary. Therefore $\Pi_t(j) - j$ is roughly a sum of $t$ 'small' i.i.d. random
variables.

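To distinguish $P^n_{\mathrm{id}}(\Pi_{t_n} \in \cdot)$, with $t_n$ as in Theorem 1.1, from $U^n$, we study
the size of sets of the form

$$\Delta_\alpha(\sigma) \triangleq \left\{ j \in D^n : |\sigma(j) - j| \le \alpha\sqrt{n\log n} \right\}, \quad \sigma \in S_n,$$

where $D^n = [n] \cap [n(1-\epsilon)/2,\ n(1+\epsilon)/2]$, for fixed $\epsilon \in (0,1)$ and a
parameter $\alpha > 0$. We shall see that, as long as $c_n$ does not approach $\infty$ too
fast,

$$|\Delta_\alpha| / \left(2\epsilon\alpha\sqrt{n\log n}\right) \to 1 \quad \text{in probability},$$

under both measures. That is, the probability that

$$\left|\, |\Delta_\alpha| / \left(2\epsilon\alpha\sqrt{n\log n}\right) - 1 \right| > \delta$$

approaches 0, as $n \to \infty$, for any $\delta > 0$. However, the deviation
$|\Delta_\alpha| - 2\epsilon\alpha\sqrt{n\log n}$, which for $P^n_{\mathrm{id}}(\Pi_{t_n} \in \cdot)$ is dominated by $|\Delta_\alpha \cap A^{t_n}|$, i.e. by
the cards yet-to-be-removed, is of different order for the two measures.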

In Section 2 we prove Theorem 1.2 and other related results. We analyze
the distribution of $|\Delta_\alpha|$ under $U^n$, and the distributions of $|\Delta_\alpha \cap A^{t_n}|$ and
$|\Delta_\alpha \setminus A^{t_n}|$ under $P^n_{\mathrm{id}}$ in Section 3. The proof of Theorem 1.1, given in Section
4, then easily follows. Lastly, in Section 5 we prove two results which are
used in the previous sections.

2. The Position of Cards Yet-to-be-Removed. In this section we 
prove Theorem 1.2 and other related results. 
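The increment distribution of $\Pi_t$ assigns to each $\tau \in S_n$ the probability

$$\text{(2.1)} \qquad \begin{cases} 1/n & \text{if } \tau = \mathrm{id}, \\ 2/n^2 & \text{if } \tau = c_{ij} \text{ with } 1 \le i, j \le n \text{ and } |i - j| = 1, \\ 1/n^2 & \text{if } \tau = c_{ij} \text{ with } 1 \le i, j \le n \text{ and } |i - j| > 1, \\ 0 & \text{otherwise,} \end{cases}$$

where $c_{ij}$ is the cycle corresponding to removing the card in position $i$ and
reinserting it in position $j$, that is

$$c_{ij} = \begin{cases} \mathrm{id} & \text{if } i = j, \\ (j, j-1, \ldots, i+1, i) & \text{if } i < j, \\ (j, j+1, \ldots, i-1, i) & \text{if } i > j. \end{cases}$$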
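Let $2 \le n \in \mathbb{N}$ and $j \in [n]$. Under conditioning on $\{j \in A^t\}$, $\Pi_m(j)$, $m =
0, \ldots, t$ is a time homogeneous Markov chain with transition probabilities

$$\text{(2.2)} \qquad p^n_{i,i+k} \triangleq P^n_{\mathrm{id}}\left(\Pi_{m+1}(j) = i + k \mid \Pi_m(j) = i,\ j \in A^t\right) = \begin{cases} \dfrac{i(n-i)}{n(n-1)} & \text{if } k = +1, \\[2mm] \dfrac{(i-1)(n-i+1)}{n(n-1)} & \text{if } k = -1, \\[2mm] \dfrac{(i-1)^2 + (n-i)^2}{n(n-1)} & \text{if } k = 0, \\[2mm] 0 & \text{otherwise.} \end{cases}$$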
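
The transition probabilities in (2.2) can be checked by direct enumeration of the n(n-1) equally likely moves that leave the tracked card untouched. The sketch below is only an illustration with hypothetical function names; the k = 0 numerator used in `formula`, (i-1)^2 + (n-i)^2, is an assumption obtained from the requirement that each row sums to one.

```python
from fractions import Fraction

def conditional_transitions(n, i):
    """Law of the displacement k of a card at position i after one shuffle,
    given that this card is not the one removed (exact enumeration)."""
    counts = {}
    for src in range(1, n + 1):          # position of the removed card
        if src == i:
            continue                     # conditioning: our card is not removed
        for dst in range(1, n + 1):      # final position of the reinserted card
            pos = i - 1 if src < i else i        # our position after the removal
            if dst <= pos:
                pos += 1                 # reinserted at or before our card
            counts[pos - i] = counts.get(pos - i, 0) + 1
    total = n * (n - 1)
    return {k: Fraction(c, total) for k, c in counts.items()}

def formula(n, i):
    """Closed-form transition probabilities as in (2.2)."""
    d = n * (n - 1)
    return {+1: Fraction(i * (n - i), d),
            -1: Fraction((i - 1) * (n - i + 1), d),
            0: Fraction((i - 1) ** 2 + (n - i) ** 2, d)}
```

Comparing the two functions over all positions of a small deck confirms, in particular, that the difference between the upward and downward probabilities is (n - 2i + 1)/(n(n-1)), which is at most 1/n in absolute value.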



One of the difficulties in analyzing the chain is the fact that the transition
probabilities $p^n_{i,i+k}$ are inhomogeneous in $i$. To overcome this, we consider a
modification of the process for which inhomogeneity is 'truncated' by setting
transition probabilities far from the initial state to be identical to those in
the initial state. As we shall see, a bound on the total variation distance of
the marginal distributions of the modified and original processes is easily
established.
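
For $j \in [n]$ and $M > 0$, let $j \pm M = [n] \cap [j - M, j + M]$ and let $\zeta_m =
\zeta_m^{n,j,M}$, $m = 0, 1, \ldots$ be a Markov process starting at $\zeta_0 = j$, with transition
probabilities $p^{n,j,M}_{i,i+k} \triangleq P(\zeta_{m+1} = i + k \mid \zeta_m = i)$ such that

$$p^{n,j,M}_{i,i+k} = \begin{cases} p^n_{i,i+k} & \text{if } i \in j \pm M, \\ p^n_{j,j+k} & \text{otherwise.} \end{cases}$$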



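Clearly, for any sequence $(k_m)_{m=0}^t \in \mathbb{Z}^{t+1}$, if $\max_{0\le m\le t} |k_m - j| \le M$
then

$$\text{(2.3)} \qquad P\left((\zeta_m)_{m=0}^t = (k_m)_{m=0}^t\right) = P^n_{\mathrm{id}}\left((\Pi_m(j))_{m=0}^t = (k_m)_{m=0}^t \mid j \in A^t\right).$$

Therefore, by taking complements, for any $u \le M$,

$$\text{(2.4)} \qquad P^n_{\mathrm{id}}\left(\max_{0\le m\le t} |\Pi_m(j) - j| > u \;\Big|\; j \in A^t\right) = P\left(\max_{0\le m\le t} |\zeta_m - j| > u\right).$$

Moreover, (2.3) implies that for any $B \subset \mathbb{Z}^{t+1}$

$$P^n_{\mathrm{id}}\left((\Pi_m(j))_{m=0}^t \in B \mid j \in A^t\right) - P\left((\zeta_m)_{m=0}^t \in B\right)$$
$$= P^n_{\mathrm{id}}\left((\Pi_m(j))_{m=0}^t \in B,\ \max_{0\le m\le t} |\Pi_m(j) - j| > M \;\Big|\; j \in A^t\right) - P\left((\zeta_m)_{m=0}^t \in B,\ \max_{0\le m\le t} |\zeta_m - j| > M\right).$$

Since both terms in the last equality are bounded from above by the equal
expressions of (2.4) with $u = M$ (and from below by zero), it follows that

$$\text{(2.5)} \qquad \left\| P^n_{\mathrm{id}}\left((\Pi_m(j))_{m=0}^t \in \cdot \mid j \in A^t\right) - P\left((\zeta_m)_{m=0}^t \in \cdot\right) \right\|_{TV} \le P\left(\max_{0\le m\le t} |\zeta_m - j| > M\right).$$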



A simple computation shows that $\left| p^n_{i,i+1} - p^n_{i,i-1} \right|$ is bounded by $\frac1n$ for
any $i$. On the other hand, $p^n_{i,i\pm1}$ is roughly equal to $i(n-i)/n^2$. Thus if $j$
is large enough and $M$, and thus $|j \pm M|$, is small compared to $j$, we can
think of $\zeta_m^{n,j,M}$ as a perturbation of a random walk with a very small bias.
In order to make this precise we decompose $\zeta_m^{n,j,M}$ as a sum of a random
walk determined by the increment distribution in state $j$ and two additional
random processes related to the 'defects' in symmetry and homogeneity in
state.

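Consider the vector-valued Markov process

$$(S_m, X_m, Y_m) = \left(S_m^{n,j,M}, X_m^{n,j,M}, Y_m^{n,j,M}\right)$$

starting at $(S_0, X_0, Y_0) = (0,0,0)$ with transition probabilities as follows.
For each $k \in \mathbb{Z}$ define

$$\text{(2.6)} \qquad q_k = \min\left\{p^{n,j,M}_{k,k+1},\ p^{n,j,M}_{k,k-1}\right\}, \qquad r_k = \max\left\{p^{n,j,M}_{k,k+1},\ p^{n,j,M}_{k,k-1}\right\}.$$

For a state $(i_1, i_2, i_3)$ set $i = i_1 + i_2 + i_3$ and define

$$w_i = \arg\max_{k \in \{\pm1\}} p^{n,j,M}_{j+i,\,j+i+k}, \qquad z_i = \mathrm{sgn}\left(q_j - q_{j+i}\right),$$

where sgn is the sign function (the definition of sgn at zero will not matter
to us). Define the transition probabilities by

$$P\left((S_{m+1}, X_{m+1}, Y_{m+1}) = (i_1 + k_1, i_2 + k_2, i_3 + k_3) \mid (S_m, X_m, Y_m) = (i_1, i_2, i_3)\right)$$
$$= \begin{cases} \min\{q_{j+i}, q_j\} & \text{if } (k_1, k_2, k_3) = (+1, 0, 0), \\ \min\{q_{j+i}, q_j\} & \text{if } (k_1, k_2, k_3) = (-1, 0, 0), \\ |q_{j+i} - q_j| & \text{if } (k_1, k_2, k_3) = \left(+\tfrac{1+z_i}{2}, -1, 0\right), \\ |q_{j+i} - q_j| & \text{if } (k_1, k_2, k_3) = \left(-\tfrac{1+z_i}{2}, +1, 0\right), \\ r_{j+i} - q_{j+i} & \text{if } (k_1, k_2, k_3) = (0, 0, w_i), \\ q & \text{if } (k_1, k_2, k_3) = (0, 0, 0), \end{cases}$$

where $q$ is chosen such that the sum of probabilities is 1.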

It is easy to verify that $(S_m + X_m + Y_m)_{m=0}^\infty$ is a Markov process with
transition probabilities identical to those of $(\zeta_m - j)_{m=0}^\infty$. Therefore the two
processes have the same law. It is also easy to check that $S_m$ is a random
walk with increment distribution

$$\mu(+1) = \mu(-1) = q_j, \qquad \mu(0) = 1 - 2q_j.$$

In order to study $X_m$ and $Y_m$ we need the following proposition.

Proposition 2.1. Let $\{A_m\}_{m=0}^\infty$ and $\{B_m\}_{m=0}^\infty$ be integer-valued random
processes starting at the same point $A_0 = B_0$. Suppose that there exist
$p^A_{ik} \in [0,1]$ such that for any $m \ge 0$ and $k, i, i_0, \ldots, i_{m-1} \in \mathbb{Z}$ (such that the
conditional probabilities are defined)

$$p^A_{ik} = P\left(A_{m+1} = k \mid A_{m+1} \ne i,\ A_m = i\right)$$
$$= P\left(A_{m+1} = k \mid A_{m+1} \ne i,\ A_m = i,\ A_{m-1} = i_{m-1}, \ldots, A_0 = i_0\right)$$

and similarly for $B_m$ with $p^B_{ik}$. Assume that for any $i, k \in \mathbb{Z}$, $p^A_{ik} = p^B_{ik}$.
Finally, suppose that for any $m \ge 0$ and $k, i, i_0, \ldots, i_{m-1}, j_0, \ldots, j_{m-1} \in \mathbb{Z}$
(whenever defined)

$$P\left(A_{m+1} \ne i \mid A_m = i,\ A_{m-1} = i_{m-1}, \ldots, A_0 = i_0\right) \ge P\left(B_{m+1} \ne i \mid B_m = i,\ B_{m-1} = j_{m-1}, \ldots, B_0 = j_0\right).$$

Then for any $t \in \mathbb{N}$ and $\delta > 0$

$$P\left(\max_{0\le m\le t} A_m \ge \delta\right) \ge P\left(\max_{0\le m\le t} B_m \ge \delta\right)$$

and

$$P\left(\max_{0\le m\le t} |A_m| \ge \delta\right) \ge P\left(\max_{0\le m\le t} |B_m| \ge \delta\right).$$

Proof. The processes $\{A_m\}$ and $\{B_m\}$ can be coupled so that they jump
from a given state to a new state according to the same order of states, say
according to the order $\{k_m\}_{m=0}^\infty$, and such that the amount of time that
$\{B_m\}$ spends in any given state $k_m$ before jumping to state $k_{m+1}$ is at least
as much as $\{A_m\}$ spends there. The proposition follows easily from this. □
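
To see the monotonicity asserted by the proposition in the simplest setting, one can compare two lazy symmetric random walks that differ only in how often they jump. The sketch below is an illustration under that assumption (it is not the coupling used in the proof); the function name is hypothetical and the hitting probability is computed exactly by dynamic programming over the walk killed at the barrier.

```python
def prob_max_abs_at_least(q, t, delta):
    """P(max_{0<=m<=t} |W_m| >= delta) for a walk from 0 with
    P(+1) = P(-1) = q and P(0) = 1 - 2q (delta a positive integer)."""
    # alive[x] = probability of sitting at x without having reached +-delta
    alive = {0: 1.0}
    for _ in range(t):
        nxt = {}
        for x, p in alive.items():
            for step, w in ((1, q), (-1, q), (0, 1 - 2 * q)):
                y = x + step
                if abs(y) < delta:       # still strictly inside the barrier
                    nxt[y] = nxt.get(y, 0.0) + p * w
        alive = nxt
    return 1.0 - sum(alive.values())

p_slow = prob_max_abs_at_least(0.1, 50, 5)   # jumps with probability 0.2 per step
p_fast = prob_max_abs_at_least(0.3, 50, 5)   # jumps with probability 0.6 per step
```

Both walks choose the direction of a jump in the same way, so the one that jumps more often should have the larger exceedance probability, in line with the proposition.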

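The only nonzero increments of $X_m$ are $\pm1$. Note that

$$P\left(X_{m+1} = i_m + 1 \mid X_{m+1} \ne i_m,\ \{X_p\}_{p=0}^m = \{i_p\}_{p=0}^m\right)$$
$$= \sum_{(k_1,k_2)\in\mathbb{Z}^2} P\left(X_{m+1} = i_m + 1 \mid X_{m+1} \ne i_m,\ \{X_p\}_{p=0}^m = \{i_p\}_{p=0}^m,\ S_m = k_1,\ Y_m = k_2\right)$$
$$\qquad\times P\left(S_m = k_1,\ Y_m = k_2 \mid X_{m+1} \ne i_m,\ \{X_p\}_{p=0}^m = \{i_p\}_{p=0}^m\right) = \frac12.$$

The last equality follows from the Markov property of $(S_m, X_m, Y_m)$. The same,
of course, holds for the negative increment. In addition, again by the Markov
property,

$$P\left(X_{m+1} \ne i_m \mid \{X_p\}_{p=0}^m = \{i_p\}_{p=0}^m\right)$$
$$= \sum_{(k_1,k_2)\in\mathbb{Z}^2} P\left(X_{m+1} \ne i_m \mid \{X_p\}_{p=0}^m = \{i_p\}_{p=0}^m,\ S_m = k_1,\ Y_m = k_2\right) \times P\left(S_m = k_1,\ Y_m = k_2 \mid \{X_p\}_{p=0}^m = \{i_p\}_{p=0}^m\right)$$
$$\le \max_{k_1,k_2} P\left(X_{m+1} \ne i_m \mid X_m = i_m,\ S_m = k_1,\ Y_m = k_2\right) \le 2\max_{i \in j\pm M} |q_i - q_j|$$
$$\le 2M \max_{x\in[1,n]} \max\left\{ \left|\frac{d}{dx}\frac{x(n-x)}{n(n-1)}\right|,\ \left|\frac{d}{dx}\frac{(x-1)(n-x+1)}{n(n-1)}\right| \right\} \le \frac{2M}{n-1},$$

where the maximum in the first inequality is over all $k_1, k_2$ such that the
conditional probability is defined.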

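Thus, according to Proposition 2.1, for $\delta > 0$,

$$\text{(2.7)} \qquad P\left(\max_{0\le m\le t} |X_m| \ge \delta\right) \le P\left(\max_{0\le m\le t} |W_m| \ge \delta\right),$$

where $W_m = W_m^{n,M}$ is a random walk starting at 0 with increment distribution

$$\nu(+1) = \nu(-1) = \frac{M}{n-1}, \qquad \nu(0) = 1 - \frac{2M}{n-1}.$$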



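Similarly, for the process $\bar{Y}_t = \sum_{m=1}^t |Y_m - Y_{m-1}|$, whose increments are
0 and 1, we have

$$P\left(\bar{Y}_{m+1} \ne i_m \mid \{\bar{Y}_p\}_{p=0}^m = \{i_p\}_{p=0}^m\right) \le \max_{k_1,k_2} P\left(Y_{m+1} \ne i_m \mid Y_m = i_m,\ S_m = k_1,\ X_m = k_2\right)$$
$$\le \max_{i\in\mathbb{Z}} \left(r_{j+i} - q_{j+i}\right) = \max_{i \in j\pm M} \left| \frac{i(n-i)}{n(n-1)} - \frac{(i-1)(n-i+1)}{n(n-1)} \right| = \max_{i \in j\pm M} \frac{|n - 2i + 1|}{n(n-1)} \le \frac1n.$$

Therefore, for $\delta > 0$,

$$\text{(2.8)} \qquad P\left(\max_{0\le m\le t} |Y_m| \ge \delta\right) \le P\left(\bar{Y}_t \ge \delta\right) \le P\left(N_t \ge \delta\right),$$

where $N_t = N_t^n \sim \mathrm{Bin}\left(t, \frac1n\right)$.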

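Since the increment distributions of $W_m$ and $S_m$ are symmetric, the classical
Lévy inequality yields, for any $\delta > 0$,

$$\text{(2.9)} \qquad P\left(\max_{0\le m\le t} |W_m| \ge \delta\right) \le 4P\left(W_t \ge \delta\right)$$

and

$$\text{(2.10)} \qquad P\left(\max_{0\le m\le t} |S_m| \ge \delta\right) \le 4P\left(S_t \ge \delta\right).$$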

Having established the connections between the different processes, we 
are now ready to prove Theorem 1.2. 

Proof of Theorem 1.2. The case where $\gamma = 1$ follows by symmetry from
the case with $\gamma = 0$. Assume $\gamma \in [0, 1)$. In this case, the hypotheses in the
theorem are equivalent to

$$\lim_{n\to\infty} \frac{n}{t_n j_n} = \lim_{n\to\infty} \frac{t_n}{n j_n} = 0.$$



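Let $n \in \mathbb{N}$, $j \in [n]$ and $M > 0$. Based on (2.7)-(2.9) and a union bound,
for $u \in \mathbb{R}$, $\delta > 0$, we have

$$P\left(\zeta_t - j > u\right) \le P\left(S_t > u - \delta\right) + P\left(\max_{0\le m\le t} |X_m| \ge \frac{\delta}{2}\right) + P\left(\max_{0\le m\le t} |Y_m| \ge \frac{\delta}{2}\right)$$
$$\le P\left(S_t > u - \delta\right) + 4P\left(W_t \ge \frac{\delta}{2}\right) + P\left(N_t \ge \frac{\delta}{2}\right).$$

Similarly,

$$P\left(\zeta_t - j > u\right) \ge P\left(S_t > u + \delta\right) - P\left(\max_{0\le m\le t} |X_m| \ge \frac{\delta}{2}\right) - P\left(\max_{0\le m\le t} |Y_m| \ge \frac{\delta}{2}\right)$$
$$\ge P\left(S_t > u + \delta\right) - 4P\left(W_t \ge \frac{\delta}{2}\right) - P\left(N_t \ge \frac{\delta}{2}\right).$$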

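Assume $\frac{\delta}{2} - \frac{t}{n} > 0$. By computing moments and applying the Berry-Esseen
theorem to approximate the tail probability function of $S_t$, and applying
Chebyshev's inequality to bound the tail probability functions of $W_t$ and
$N_t$, we arrive at

$$\text{(2.11)} \qquad P\left(\zeta_t^{n,j,M} - j > u\right) \le \Psi\left(\frac{u - \delta}{\sqrt{2 q_j t}}\right) + \frac{C}{\sqrt{2 q_j t}} + \frac{32 M t}{\delta^2 (n-1)} + \frac{\frac{t}{n}\cdot\frac{n-1}{n}}{\left(\frac{\delta}{2} - \frac{t}{n}\right)^2}$$

and

$$\text{(2.12)} \qquad P\left(\zeta_t^{n,j,M} - j > u\right) \ge \Psi\left(\frac{u + \delta}{\sqrt{2 q_j t}}\right) - \frac{C}{\sqrt{2 q_j t}} - \frac{32 M t}{\delta^2 (n-1)} - \frac{\frac{t}{n}\cdot\frac{n-1}{n}}{\left(\frac{\delta}{2} - \frac{t}{n}\right)^2},$$

where $q_j$ is defined in (2.6), $C$ is the constant from the Berry-Esseen theorem
and $\Psi$ is the tail probability function of a standard normal variable.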

For two sequences of positive numbers $v_n, v'_n$ let us denote $v_n \ll v'_n$ if
and only if $\lim_{n\to\infty} v_n / v'_n = 0$. By assumption, $\sqrt{t_n j_n / n} \ll j_n$, therefore we
can choose a sequence $M_n$ such that $\sqrt{t_n j_n / n} \ll M_n \ll j_n$. Similarly, since
$M_n \ll j_n$ we can set $\delta_n$ with $\sqrt{M_n t_n / n} \ll \delta_n \ll \sqrt{t_n j_n / n}$. As assumed, $t_n/n \ll j_n$
and $1 \ll t_n j_n / n$. Therefore $t_n/n,\ 1 \ll \sqrt{t_n j_n / n} \ll M_n$, which also implies that
$t_n/n,\ \sqrt{t_n/n} \ll \delta_n$.

Now, let $x \in \mathbb{R}$ and set $u_n = x\sqrt{2 t_n \lambda_n}$. Let us consider the inequalities
derived from (2.11) and (2.12) by replacing each of the parameters by a
corresponding element from the sequences above. Based on the relations
established for the sequences and the assumptions on $t_n$ and $j_n$ it can be
easily verified that, upon letting $n \to \infty$, all terms but those involving $\Psi$
go to zero. Relying, in addition, on the fact that $\Psi$ is continuous, it can be
easily verified that the terms involving $\Psi$ in both bounds converge to $\Psi(x)$.
Hence we conclude that

$$\text{(2.13)} \qquad \lim_{n\to\infty} P\left(\zeta_{t_n}^{n,j_n,M_n} - j_n > u_n\right) = \Psi(x).$$
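
Based on (2.7)-(2.10),

$$\text{(2.14)} \qquad P\left(\max_{0\le m\le t} \left|\zeta_m^{n,j,M} - j\right| > M\right) \le P\left(\max_{0\le m\le t} |S_m| > M - \delta\right) + P\left(\max_{0\le m\le t} |X_m| \ge \frac{\delta}{2}\right) + P\left(\max_{0\le m\le t} |Y_m| \ge \frac{\delta}{2}\right)$$
$$\le 4P\left(S_t > M - \delta\right) + 4P\left(W_t \ge \frac{\delta}{2}\right) + P\left(N_t \ge \frac{\delta}{2}\right)$$
$$\le \frac{8 q_j t}{(M - \delta)^2} + \frac{32 M t}{\delta^2 (n-1)} + \frac{\frac{t}{n}\cdot\frac{n-1}{n}}{\left(\frac{\delta}{2} - \frac{t}{n}\right)^2},$$

where the last inequality follows from Chebyshev's inequality.

As before, replace the parameters by the corresponding elements from
the sequences and let $n \to \infty$. The middle and right-hand side summands of
(2.14) were already shown to go to zero as $n \to \infty$. Since $\delta_n \ll \sqrt{t_n j_n / n} \ll M_n$
the additional term also goes to zero. Combined with (2.5) and (2.13) this
gives

$$\lim_{n\to\infty} P^n_{\mathrm{id}}\left(\Pi_{t_n}(j_n) - j_n > u_n \mid j_n \in A^{t_n}\right) = \Psi(x),$$

which completes the proof. □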

In Theorem 1.2 for each $n$ only a single card $j_n$ of the deck of size $n$ is
involved. The following gives a uniform bound (in initial position and in
time) for the tail distributions of the difference from the initial position.

Theorem 2.1. Let $\alpha > 0$ and let $t_n$ be a sequence such that $\lim_{n\to\infty} t_n =
\lim_{n\to\infty} n^2/t_n = \infty$. Then

$$\limsup_{n\to\infty} \max_{j\in[n]} P^n_{\mathrm{id}}\left(\max_{0\le m\le t_n} |\Pi_m(j) - j| > \alpha\sqrt{\frac{t_n}{2}} \;\Big|\; j \in A^{t_n}\right) \le 4\Psi(\alpha).$$



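Proof. Set $u_n = \alpha\sqrt{t_n/2}$ and $j_n = \lfloor n/2 \rfloor$. Let $M_n > u_n$ and $\delta_n > 0$ be
sequences to be determined below. From (2.4), (2.7) and (2.8) it follows that

$$\max_{j\in[n]} P^n_{\mathrm{id}}\left(\max_{0\le m\le t_n} |\Pi_m(j) - j| > u_n \;\Big|\; j \in A^{t_n}\right) = \max_{j\in[n]} P\left(\max_{0\le m\le t_n} \left|\zeta_m^{n,j,M_n} - j\right| > u_n\right)$$
$$\le \max_{j\in[n]} \left\{ P\left(\max_{0\le m\le t_n} \left|S_m^{n,j,M_n}\right| > u_n - \delta_n\right) + P\left(\max_{0\le m\le t_n} \left|W_m^{n,M_n}\right| \ge \frac{\delta_n}{2}\right) + P\left(N^n_{t_n} \ge \frac{\delta_n}{2}\right) \right\}.$$

It is easy to check that for the random walks $S_m^{n,j,M_n}$, $j \in [n]$, the
probabilities of the nonzero increments, $\pm1$, are maximal when $j = j_n$. Therefore
according to Proposition 2.1,

$$\max_{j\in[n]} P^n_{\mathrm{id}}\left(\max_{0\le m\le t_n} |\Pi_m(j) - j| > u_n \;\Big|\; j \in A^{t_n}\right)$$
$$\le P\left(\max_{0\le m\le t_n} \left|S_m^{n,j_n,M_n}\right| > u_n - \delta_n\right) + P\left(\max_{0\le m\le t_n} \left|W_m^{n,M_n}\right| \ge \frac{\delta_n}{2}\right) + P\left(N^n_{t_n} \ge \frac{\delta_n}{2}\right).$$

From (2.9) and (2.10) we conclude that

$$\max_{j\in[n]} P^n_{\mathrm{id}}\left(\max_{0\le m\le t_n} |\Pi_m(j) - j| > u_n \;\Big|\; j \in A^{t_n}\right)$$
$$\le 4\left\{ P\left(S_{t_n}^{n,j_n,M_n} > u_n - \delta_n\right) + P\left(W_{t_n}^{n,M_n} \ge \frac{\delta_n}{2}\right) + P\left(N^n_{t_n} \ge \frac{\delta_n}{2}\right) \right\}.$$

Finally, note that $j_n$ and $t_n$ meet the conditions of Theorem 1.2 with
$\gamma = \frac12$. Therefore, defining $M_n$ and $\delta_n$ as in the proof of the theorem (which
also implies $M_n > u_n$) and following the same arguments therein,

$$\lim_{n\to\infty} P\left(S_{t_n}^{n,j_n,M_n} > u_n - \delta_n\right) = \Psi(\alpha), \qquad \lim_{n\to\infty} P\left(W_{t_n}^{n,M_n} \ge \frac{\delta_n}{2}\right) = \lim_{n\to\infty} P\left(N^n_{t_n} \ge \frac{\delta_n}{2}\right) = 0.$$

This completes the proof. □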

3. Cards of Distance $O(\sqrt{n\log n})$ from their Initial Position.

The results of Section 2 show that the position of a card that has not been
removed is fairly concentrated around the initial position. This, of course,
is a rare event for each card under the uniform measure $U^n$. On the other
hand, as the number of shuffles performed grows (i.e., as $t$ grows), more
cards are removed for reinsertion.

To distinguish the measure induced by the shuffle from the uniform measure,
we consider the size of sets of the form

$$\Delta_\alpha(\sigma) \triangleq \left\{ j \in D^n : |\sigma(j) - j| \le \alpha\sqrt{n\log n} \right\}, \quad \sigma \in S_n,$$

where $D^n = [n] \cap [n(1-\epsilon)/2,\ n(1+\epsilon)/2]$ and $\epsilon \in (0,1)$ is arbitrary and
will be fixed throughout the proofs. Under $U^n$, for $i \ne j$, the events $\{i \in \Delta_\alpha\}$
and $\{j \in \Delta_\alpha\}$ are 'almost' independent, as $n \to \infty$. Therefore one should
expect $|\Delta_\alpha| - E^{U^n}\{|\Delta_\alpha|\}$ to be of order $\left(E^{U^n}\{|\Delta_\alpha|\}\right)^{1/2}$. Under $P^n_{\mathrm{id}}(\Pi_{t_n} \in \cdot)$,
if $|A^{t_n}|$ is relatively small, it seems natural that the positions of the cards
that have been removed are distributed approximately as they would under
$U^n$. Thus, $|\Delta_\alpha \setminus A^{t_n}|$ under $P^n_{\mathrm{id}}(\Pi_{t_n} \in \cdot)$ should be distributed roughly as
$|\Delta_\alpha|$ is under $U^n$. By that logic, we need to choose $t_n$ so that $|\Delta_\alpha \cap A^{t_n}|$ is
larger than $\left(E^{U^n}\{|\Delta_\alpha|\}\right)^{1/2}$ with high probability. Requiring the expectation
$E^n_{\mathrm{id}}\{|\Delta_\alpha \cap A^{t_n}|\}$ to be larger than $\left(E^{U^n}\{|\Delta_\alpha|\}\right)^{1/2}$ leads us to set $t_n$
to be $\frac34 n\log n$, up to a lower order term.

In order to prove the lower bound on the mixing time, we first prove the
three lemmas below. The first treats the distribution of $|\Delta_\alpha|$ under $U^n$. The
other two deal with $|\Delta_\alpha \cap A^{t_n}|$ and $|\Delta_\alpha \setminus A^{t_n}|$ under $P^n_{\mathrm{id}}(\Pi_{t_n} \in \cdot)$.

Let $R_j$ denote the event $\{j \in \Delta_\alpha\}$.
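
The heuristic above can be made numerical. The sketch below is an illustration only: it assumes that the expected number of untouched cards in the middle window is roughly eps*n*(1 - 1/n)^t, that the uniform-measure expectation is roughly 2*eps*alpha*sqrt(n log n), and it solves for the time at which the first equals the square root of the second. The function name and the default parameter values are hypothetical.

```python
import math

def heuristic_time(n, eps=0.5, alpha=1.0):
    """Solve eps*n*(1 - 1/n)**t = sqrt(2*eps*alpha*sqrt(n*log n)) for t."""
    target = 0.5 * math.log(2 * eps * alpha * math.sqrt(n * math.log(n)))
    return (target - math.log(eps * n)) / math.log1p(-1.0 / n)

# Ratio t / (n log n) for increasingly large decks.
ratios = [heuristic_time(n) / (n * math.log(n)) for n in (10**4, 10**8, 10**16)]
```

The ratios increase slowly toward 3/4; the gap is of order log log n / log n, consistent with the statement that the balance point is three quarters of n log n up to a lower order term.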

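Lemma 3.1. For any $\alpha, k > 0$,

$$\limsup_{n\to\infty} U^n\left( \left|\, |\Delta_\alpha(\sigma)| - 2\epsilon\alpha\sqrt{n\log n} \,\right| > k\sqrt{2\epsilon\alpha}\,(n\log n)^{1/4} \right) \le \frac{1}{k^2}.$$

Proof. Suppose $n$ is large enough so that $n(1-\epsilon)/2 > \alpha\sqrt{n\log n}$. Then

$$E^{U^n}\{|\Delta_\alpha(\sigma)|\} = \sum_{j\in D^n} U^n(R_j) = |D^n|\,\frac{1 + 2\lfloor \alpha\sqrt{n\log n} \rfloor}{n} = 2\epsilon\alpha\sqrt{n\log n} + O(1).$$

The second moment satisfies the bound

$$E^{U^n}\{|\Delta_\alpha(\sigma)|^2\} = \sum_{j\in D^n} U^n(R_j) + \sum_{i,j\in D^n:\, i\ne j} U^n(R_i \cap R_j)$$
$$\le E^{U^n}\{|\Delta_\alpha(\sigma)|\} + |D^n|^2\, \frac{\left(1 + 2\lfloor \alpha\sqrt{n\log n} \rfloor\right)^2}{n(n-1)} = E^{U^n}\{|\Delta_\alpha(\sigma)|\} + \frac{n}{n-1}\left(E^{U^n}\{|\Delta_\alpha(\sigma)|\}\right)^2,$$

which implies

$$\mathrm{Var}^{U^n}\{|\Delta_\alpha(\sigma)|\} \le 2\epsilon\alpha\sqrt{n\log n} + 4\epsilon^2\alpha^2\log n + O(1).$$

Applying Chebyshev's inequality and letting $n \to \infty$ yields the required
result. □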
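
For a tiny deck the first-moment formula in the proof, |D| (1 + 2 floor(alpha sqrt(n log n))) / n, can be checked by enumerating all permutations. The sketch below is only an illustration; the function names are hypothetical, and the parameters in the usage note are chosen so that the window around every card of the middle set stays inside the deck, as the proof assumes.

```python
from itertools import permutations
from fractions import Fraction
import math

def delta_alpha_size(sigma, D, width):
    """Number of cards j in D with |sigma(j) - j| <= width."""
    return sum(1 for j in D if abs(sigma[j - 1] - j) <= width)

def exact_mean_under_uniform(n, eps, alpha):
    """Exact mean of the set size under the uniform measure on S_n."""
    width = math.floor(alpha * math.sqrt(n * math.log(n)))
    D = [j for j in range(1, n + 1)
         if n * (1 - eps) / 2 <= j <= n * (1 + eps) / 2]
    total = 0
    count = 0
    for sigma in permutations(range(1, n + 1)):   # sigma[j-1] = position of card j
        total += delta_alpha_size(sigma, D, width)
        count += 1
    return Fraction(total, count), D, width

mean, D, width = exact_mean_under_uniform(7, 0.5, 0.3)
```

With n = 7, eps = 0.5 and alpha = 0.3 the middle set is {2, 3, 4, 5} and the window has half-width 1, so the enumerated mean can be compared with 4 * 3 / 7.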

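Next, we consider $\Delta_\alpha(\Pi_{t_n}) \cap A^{t_n}$ under $P^n_{\mathrm{id}}$. For this we shall need the
following lemma whose proof is in Section 5.

Lemma 3.2. Let $n, t \in \mathbb{N}$, let $B \subset [n]$ be a random set and let $D \subset [n]$
be a deterministic set, and suppose that for some $c > 0$

$$\min_{j\in D} P^n_{\mathrm{id}}\left(j \in B \mid j \in A^t\right) \ge c.$$

Then, denoting $K = E^n_{\mathrm{id}}\{|D \cap A^t|\}$, for any $r \in (0,1)$,

$$P^n_{\mathrm{id}}\left(|B \cap D \cap A^t| \le r \cdot E^n_{\mathrm{id}}\{|B \cap D \cap A^t|\}\right) \le \frac{K + (1 - c^2) K^2}{(1-r)^2 c^2 K^2}.$$

Let $R_{j,t}$ and $\tilde{R}_{j,t}$ denote the events $\{j \in \Delta_\alpha(\Pi_t)\}$ and $\{j \in \Delta_\alpha(\Pi_t)\} \cap
\{j \notin A^t\}$, respectively. Let $p_{t,n} = P^n_{\mathrm{id}}(j \in A^t)$ (which is, of course, independent
of $j$).

Lemma 3.3. Let $v(\alpha) = 1 - 4\Psi\left(\alpha\sqrt{8/3}\right)$. Let $t_n \le \lfloor \frac34 n\log n \rfloor$ and suppose
$\alpha$ satisfies $v(\alpha) > 0$. Then, for any $r \in (0,1)$,

$$\limsup_{n\to\infty} P^n_{\mathrm{id}}\left(|\Delta_\alpha(\Pi_{t_n}) \cap A^{t_n}| \le r\, v(\alpha)\, \epsilon\, n\, p_{t_n,n}\right) \le (1-r)^{-2}\left(v^{-2}(\alpha) - 1\right).$$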
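
Proof. With $S(n,\alpha)$ defined by

$$S(n,\alpha) = \min_{j\in D^n} P^n_{\mathrm{id}}\left(\max_{0\le m\le \lfloor \frac34 n\log n \rfloor} |\Pi_m(j) - j| \le \alpha\sqrt{n\log n} \;\Big|\; j \in A^{\lfloor \frac34 n\log n \rfloor}\right) \le \min_{j\in D^n} P^n_{\mathrm{id}}\left(R_{j,t_n} \mid j \in A^{t_n}\right),$$

Lemma 3.2 yields

$$P^n_{\mathrm{id}}\left(|\Delta_\alpha(\Pi_{t_n}) \cap A^{t_n}| \le r \cdot E^n_{\mathrm{id}}\{|\Delta_\alpha(\Pi_{t_n}) \cap A^{t_n}|\}\right) \le \frac{K_{t_n} + \left(1 - S^2(n,\alpha)\right) K_{t_n}^2}{(1-r)^2 S^2(n,\alpha) K_{t_n}^2},$$

where $K_{t_n} = E^n_{\mathrm{id}}\{|D^n \cap A^{t_n}|\}$.

A simple calculation shows that $\lim_{n\to\infty} K_{t_n} = \infty$. Theorem 2.1 (with
$t_n = \lfloor \frac34 n\log n \rfloor$) implies that

$$\text{(3.1)} \qquad \liminf_{n\to\infty} S(n,\alpha) \ge 1 - 4\Psi\left(\alpha\sqrt{\frac83}\right) = v(\alpha) > 0.$$

Therefore

$$\text{(3.2)} \qquad \limsup_{n\to\infty} P^n_{\mathrm{id}}\left(|\Delta_\alpha(\Pi_{t_n}) \cap A^{t_n}| \le r \cdot E^n_{\mathrm{id}}\{|\Delta_\alpha(\Pi_{t_n}) \cap A^{t_n}|\}\right)$$
$$\le \frac{1}{(1-r)^2} \limsup_{n\to\infty} \frac{1 - S^2(n,\alpha)}{S^2(n,\alpha)} \le (1-r)^{-2}\left(v^{-2}(\alpha) - 1\right).$$

Note that by (3.1), for any $\delta \in (0,1)$ and sufficiently large $n$,

$$E^n_{\mathrm{id}}\{|\Delta_\alpha(\Pi_{t_n}) \cap A^{t_n}|\} = \sum_{j\in D^n} P^n_{\mathrm{id}}\left(R_{j,t_n} \mid j \in A^{t_n}\right) P^n_{\mathrm{id}}\left(j \in A^{t_n}\right) \ge |D^n|\, p_{t_n,n}\, S(n,\alpha) \ge \delta\, v(\alpha)\, \epsilon\, n\, p_{t_n,n}.$$

Together with (3.2), this implies

$$\limsup_{n\to\infty} P^n_{\mathrm{id}}\left(|\Delta_\alpha(\Pi_{t_n}) \cap A^{t_n}| \le r\, v(\alpha)\, \epsilon\, n\, p_{t_n,n}\right) \le (1 - r/\delta)^{-2}\left(v^{-2}(\alpha) - 1\right).$$

By letting $\delta \nearrow 1$, the lemma follows. □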

Lastly, we consider $\Delta_\alpha(\Pi_{t_n}) \setminus A^{t_n}$. For this we will need the following
proposition whose proof is given in Section 5.

Proposition 3.1. Denote the chain corresponding to random-to-random
insertions shuffle starting at $\mathrm{id} \in S_n$ by $\{\Pi^{\mathrm{id}}_m\}_{m=0}^\infty$ and the chain corresponding
to random-to-random insertions shuffle with initial distribution $U^n$ by
$\{\Pi^{U^n}_m\}_{m=0}^\infty$. For any $n \ge 1$, $t \le \lfloor \frac34 n\log n \rfloor$ and different $i, j \in [n]$, the two
chains can be coupled so that $i$ and $j$ are chosen for removal at the same
times for both and, under the corresponding measure $P^n_{\mathrm{id},U^n}$,

$$P^n_{\mathrm{id},U^n}\left( \left|\Pi^{\mathrm{id}}_t(k) - \Pi^{U^n}_t(k)\right| > \rho_1 \log^2 n \;\Big|\; B^t_{i,j} \right) \le \rho_2 n^{-2},$$

for $k = i, j$, where $B^t_{i,j}$ is the event that $i$ and $j$ have been removed up to
time $t$ and $\rho_1, \rho_2$ are constants independent of $n$.
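With the proposition, as before, we can evaluate first and second moments
and apply Chebyshev's inequality.

Lemma 3.4. Let $t_n \le \lfloor \frac34 n\log n \rfloor$ be a sequence and let $k > 0$. Let $\rho_1$ be
the constant from Proposition 3.1. Then,

$$\limsup_{n\to\infty} P^n_{\mathrm{id}}\left( \left|\, |\Delta_\alpha(\Pi_{t_n}) \setminus A^{t_n}| - 2\epsilon\alpha\left(1 - p_{t_n,n}\right)\sqrt{n\log n} \,\right| > k\sqrt{8\rho_1\alpha}\,\epsilon\, n^{1/4}\log^{5/4} n \right) \le \frac{1}{k^2}.$$

Proof. Denote the first time a card $j \in [n]$ is removed by $\tau(j)$. By
definition, $P^n_{\mathrm{id}}\left(\Pi_k(j) \in \cdot \mid \tau(j) = k\right) = u^n$, where $u^n$ denotes the uniform
measure on $[n]$. Since the shuffles at times $k+1, k+2, \ldots$ are independent
of the shuffles at times $1, \ldots, k$, they are also independent of the event
$\{\tau(j) = k\}$. As a marginal distribution of $U^n$, the stationary distribution
of the chain $\{\Pi_m(j)\}_{m=0}^\infty$ coincides with $u^n$. Therefore, for $m \ge 0$,
$P^n_{\mathrm{id}}\left(\Pi_{k+m}(j) \in \cdot \mid \tau(j) = k\right) = u^n$, and, as a mixture of measures of this
form, $P^n_{\mathrm{id}}\left(\Pi_t(j) \in \cdot \mid j \notin A^t\right)$ coincides with $u^n$ as well.

Thus, if $n$ is large enough so that the window of half-width $\alpha\sqrt{n\log n}$ around
every $j \in D^n$ is contained in $[n]$,

$$E^n_{\mathrm{id}}\{|\Delta_\alpha(\Pi_t) \setminus A^t|\} = \sum_{j\in D^n} P^n_{\mathrm{id}}\left(R_{j,t} \mid j \notin A^t\right) P^n_{\mathrm{id}}\left(j \notin A^t\right) = |D^n|\left(1 - p_{t,n}\right)\frac{1 + 2\lfloor \alpha\sqrt{n\log n} \rfloor}{n}.$$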
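
For the second moment write

$$E^n_{\mathrm{id}}\{|\Delta_\alpha(\Pi_t) \setminus A^t|^2\} = \sum_{j\in D^n} P^n_{\mathrm{id}}\left(\tilde{R}_{j,t}\right) + \sum_{i,j\in D^n:\, i\ne j} P^n_{\mathrm{id}}\left(\tilde{R}_{i,t} \cap \tilde{R}_{j,t}\right)$$
$$= E^n_{\mathrm{id}}\{|\Delta_\alpha(\Pi_t) \setminus A^t|\} + \sum_{i,j\in D^n:\, i\ne j} P^n_{\mathrm{id}}\left(\tilde{R}_{i,t} \cap \tilde{R}_{j,t}\right).$$

According to Proposition 3.1, for $i \ne j$, $n \ge 1$ and $t \le \lfloor \frac34 n\log n \rfloor$,

$$P^n_{\mathrm{id}}\left(\tilde{R}_{i,t} \cap \tilde{R}_{j,t} \mid i, j \notin A^t\right) = P^n_{\mathrm{id},U^n}\left(i, j \in \Delta_\alpha(\Pi^{\mathrm{id}}_t) \mid B^t_{i,j}\right)$$
$$\le P^n_{\mathrm{id},U^n}\left( \left|\Pi^{U^n}_t(i) - i\right| \le \alpha\sqrt{n\log n} + \rho_1\log^2 n,\ \left|\Pi^{U^n}_t(j) - j\right| \le \alpha\sqrt{n\log n} + \rho_1\log^2 n \;\Big|\; B^t_{i,j} \right)$$
$$\quad + P^n_{\mathrm{id},U^n}\left( \left|\Pi^{\mathrm{id}}_t(i) - \Pi^{U^n}_t(i)\right| > \rho_1\log^2 n \;\Big|\; B^t_{i,j} \right) + P^n_{\mathrm{id},U^n}\left( \left|\Pi^{\mathrm{id}}_t(j) - \Pi^{U^n}_t(j)\right| > \rho_1\log^2 n \;\Big|\; B^t_{i,j} \right)$$
$$\le \frac{\left(1 + 2\lfloor \alpha\sqrt{n\log n} + \rho_1\log^2 n \rfloor\right)^2}{n(n-1)} + 2\rho_2 n^{-2}.$$

Therefore,

$$\sum_{i,j\in D^n:\, i\ne j} P^n_{\mathrm{id}}\left(\tilde{R}_{i,t} \cap \tilde{R}_{j,t}\right) = \sum_{i,j\in D^n:\, i\ne j} P^n_{\mathrm{id}}\left(\tilde{R}_{i,t} \cap \tilde{R}_{j,t} \mid i, j \notin A^t\right) P^n_{\mathrm{id}}\left(i, j \notin A^t\right)$$
$$\le |D^n|^2 \left( \frac{\left(1 + 2\lfloor \alpha\sqrt{n\log n} + \rho_1\log^2 n \rfloor\right)^2}{n(n-1)} + 2\rho_2 n^{-2} \right) \times \left( 1 - 2p_{t,n} + \left(\frac{n-2}{n}\right)^t \right).$$

By straightforward algebra, the above yield

$$\mathrm{Var}^n_{\mathrm{id}}\{|\Delta_\alpha(\Pi_{t_n}) \setminus A^{t_n}|\} \le 8\rho_1\alpha\epsilon^2 n^{1/2}\log^{5/2} n + O\left(\sqrt{n\log n}\right).$$

Applying Chebyshev's inequality and taking the limit completes the proof.

□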

Remark 3.1. Based on Lemma 3.1 and Lemma 3.4, combined with a
'coupon-collecting' argument to bound the probability
$P^n_{\mathrm{id}}\left(|A^{t_n}| / \sqrt{n\log n} > \delta\right)$, it is easy to see that for $t_n$ as in Theorem 1.1, as
long as $c_n$ does not approach $\infty$ too fast,

$$|\Delta_\alpha| / \left(2\epsilon\alpha\sqrt{n\log n}\right) \to 1 \quad \text{in probability},$$

under $U^n$ and $P^n_{\mathrm{id}}$.
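
4. Proof of Theorem 1.1. In order to distinguish the two measures
we consider the deviation of $|\Delta_\alpha|$ from $2\epsilon\alpha\sqrt{n\log n}$. Let $k, c > 0$, and let $\alpha$ be
such that $v(\alpha) > 0$ (where $v(\alpha)$ was defined in Lemma 3.3) and assume $t_n$
is as in the theorem. Suppose that for some $n$

$$\text{(4.1)} \qquad \left|\, |\Delta_\alpha(\Pi_{t_n}) \setminus A^{t_n}| - 2\epsilon\alpha\left(1 - p_{t_n,n}\right)\sqrt{n\log n} \,\right| \le k\sqrt{8\rho_1\alpha}\,\epsilon\, n^{1/4}\log^{5/4} n,$$

and

$$\text{(4.2)} \qquad |\Delta_\alpha(\Pi_{t_n}) \cap A^{t_n}| > \tfrac12 v(\alpha)\, \epsilon\, n\, p_{t_n,n}.$$

Then, if $n$ is sufficiently large,

$$\text{(4.3)} \qquad |\Delta_\alpha(\Pi_{t_n})| - 2\epsilon\alpha\sqrt{n\log n} > \epsilon n p_{t_n,n}\left(\tfrac12 v(\alpha) - 2\alpha\sqrt{n\log n}/n\right) - k\sqrt{8\rho_1\alpha}\,\epsilon\, n^{1/4}\log^{5/4} n$$
$$= \epsilon n p_{t_n,n}\left(\tfrac12 v(\alpha) - o(1)\right) - k\sqrt{8\rho_1\alpha}\,\epsilon\, n^{1/4}\log^{5/4} n > c\, n^{1/4}\log^{5/4} n,$$

where the last inequality follows from the following calculation.
Writing

$$\log\frac{n\, p_{t_n,n}}{n^{1/4}\log^{5/4} n} = \frac34\log n - \frac54\log\log n + \log p_{t_n,n},$$

substituting $p_{t_n,n}$ and $t_n$ and using the fact that $\log(1+x) = x + O(x^2)$ as
$x \to 0$ we arrive at

$$\log\frac{n\, p_{t_n,n}}{n^{1/4}\log^{5/4} n} = c_n + O(1) \to \infty.$$

Now, since, for large $n$, (4.1) and (4.2) imply (4.3), by a union bound,
Lemmas 3.3 and 3.4 imply

$$\liminf_{n\to\infty} P^n_{\mathrm{id}}\left(|\Delta_\alpha(\Pi_{t_n})| - 2\epsilon\alpha\sqrt{n\log n} > c\, n^{1/4}\log^{5/4} n\right) \ge 1 - \frac{1}{k^2} - 4\left(v^{-2}(\alpha) - 1\right).$$

In addition, from Lemma 3.1,

$$\limsup_{n\to\infty} U^n\left(|\Delta_\alpha(\sigma)| - 2\epsilon\alpha\sqrt{n\log n} > c\, n^{1/4}\log^{5/4} n\right) = 0.$$

Thus, since $k$ and $\alpha$ were arbitrary,

$$\liminf_{n\to\infty} \left\| P^n_{\mathrm{id}}(\Pi_{t_n} \in \cdot) - U^n \right\|_{TV} = 1.$$

□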






5. Proofs of Lemma 3.2 and Proposition 3.1. In this section we 
prove Lemma 3.2 and Proposition 3.1. We begin with the lemma, which, 
since cards are chosen independently each step, is basically a result related 
to the Coupon-Collector's problem (see Feller (1968)). 

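5.1. Proof of Lemma 3.2. By our assumption,

$$E^n_{\mathrm{id}}\{|B \cap D \cap A^t|\} = \sum_{j\in D} P^n_{\mathrm{id}}\left(j \in B \mid j \in A^t\right) P^n_{\mathrm{id}}\left(j \in A^t\right) \ge cK.$$

Write

$$E^n_{\mathrm{id}}\{|B \cap D \cap A^t|^2\} \le E^n_{\mathrm{id}}\{|D \cap A^t|^2\} = \sum_{i,j\in D} P^n_{\mathrm{id}}\left(i, j \in A^t\right)$$
$$= \sum_{j\in D} P^n_{\mathrm{id}}\left(j \in A^t\right) + \sum_{i,j\in D:\, i\ne j} P^n_{\mathrm{id}}\left(i, j \in A^t\right) = K + |D|\left(|D| - 1\right)\left(\frac{n-2}{n}\right)^t.$$

Since $K = |D|\left((n-1)/n\right)^t$ it follows that

$$E^n_{\mathrm{id}}\{|B \cap D \cap A^t|^2\} \le K + K^2,$$

therefore

$$\mathrm{Var}^n_{\mathrm{id}}\{|B \cap D \cap A^t|\} \le K + (1 - c^2) K^2.$$

Applying Chebyshev's inequality completes the proof. □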
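
The second-moment step rests on the fact that a single card survives t shuffles untouched with probability ((n-1)/n)^t and a fixed pair with probability ((n-2)/n)^t. A small exact check of the resulting inequality, with a hypothetical helper name, might look as follows.

```python
from fractions import Fraction

def unremoved_moments(n, d, t):
    """First two moments of the number of never-removed cards in a fixed
    set of d cards, after t shuffles of a deck of n cards."""
    p1 = Fraction(n - 1, n) ** t        # a given card is never chosen
    p2 = Fraction(n - 2, n) ** t        # two given cards are never chosen
    K = d * p1                          # first moment
    second = K + d * (d - 1) * p2       # second moment
    return K, second

K, second = unremoved_moments(20, 8, 30)
```

Because (n-2)n is at most (n-1)^2, the pair probability never exceeds the square of the single-card probability, which is what gives a second moment of at most K + K^2.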

5.2. Proof of Proposition 3.1. Define a coupling of $\Pi^{\mathrm{id}}_m$ and $\Pi^{U^n}_m$ as follows.
At each step remove a card from each of the decks as described below,
choose a random position and reinsert both cards in this position in both
decks. Denote the positions of $i$ and $j$ in both decks at time $m$ (i.e. after the
$m$-th shuffle) by

$$\Lambda_m \triangleq \left\{ \Pi^{\mathrm{id}}_m(i),\ \Pi^{\mathrm{id}}_m(j),\ \Pi^{U^n}_m(i),\ \Pi^{U^n}_m(j) \right\}.$$

On the $m$-th shuffle, with probability $1 - |\Lambda_{m-1}|/n$ choose a random position
from $[n] \setminus \Lambda_{m-1}$, and remove from both decks the card in this position.
With probability $1/n$ remove the card numbered $i$ from both decks. With
probability $1/n$ do the same for $j$. Lastly, with probability $\left(|\Lambda_{m-1}| - 2\right)/n$
choose a random position from

$$\Lambda_{m-1} \setminus \left\{ \Pi^{\mathrm{id}}_{m-1}(i),\ \Pi^{\mathrm{id}}_{m-1}(j) \right\}$$

and a random position from

$$\Lambda_{m-1} \setminus \left\{ \Pi^{U^n}_{m-1}(i),\ \Pi^{U^n}_{m-1}(j) \right\}$$

and remove the cards in these positions from the decks respectively. Denote
the measure corresponding to the coupling by $P^n_{\mathrm{id},U^n}$.

Note that, by definition, the coupling is symmetric in $i$ and $j$. Therefore it
is enough to prove the inequality in the proposition only for $j$. Also note
that we can, and shall, assume without loss of generality that $n \ge 9$, since
if the inequality is true for such $n$, by adjusting the constants $\rho_1, \rho_2$ it will
hold for smaller $n$.
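
One step of a coupling of this kind can be sketched in code. This is an illustration under stated assumptions rather than the construction as implemented by the author: decks are lists of card labels indexed by position, `coupled_step` is a hypothetical name, and the four cases are selected with a single uniform draw so that each position of each deck is removed with probability 1/n.

```python
import random

def coupled_step(deck_a, deck_b, i, j, rng):
    """One coupled shuffle of two decks; returns the pair of removed cards."""
    n = len(deck_a)
    pa = {deck_a.index(i) + 1, deck_a.index(j) + 1}   # positions of i, j in deck A
    pb = {deck_b.index(i) + 1, deck_b.index(j) + 1}   # positions of i, j in deck B
    special = pa | pb                                 # tracked positions (2 to 4)
    u = rng.randrange(n)                              # uniform on {0, ..., n-1}
    if u == 0:
        card_a = card_b = i                           # prob. 1/n: remove card i
    elif u == 1:
        card_a = card_b = j                           # prob. 1/n: remove card j
    elif u < len(special):
        # prob. (|special| - 2)/n: tracked positions not holding i or j
        card_a = deck_a[rng.choice(sorted(special - pa)) - 1]
        card_b = deck_b[rng.choice(sorted(special - pb)) - 1]
    else:
        # prob. 1 - |special|/n: the same untracked position in both decks
        pos = rng.choice(sorted(set(range(1, n + 1)) - special))
        card_a, card_b = deck_a[pos - 1], deck_b[pos - 1]
    slot = rng.randrange(n)                           # common reinsertion slot
    for deck, card in ((deck_a, card_a), (deck_b, card_b)):
        deck.remove(card)
        deck.insert(slot, card)
    return card_a, card_b

rng = random.Random(1)
n, i, j = 12, 3, 7
deck_a = list(range(1, n + 1))      # starts at the identity
deck_b = deck_a[:]
rng.shuffle(deck_b)                 # starts from a uniform permutation
history = [coupled_step(deck_a, deck_b, i, j, rng) for _ in range(200)]
```

In every branch either both decks lose the same tracked card or neither loses a tracked card, so the cards i and j are removed at the same times in both decks, while each deck on its own still performs a random-to-random insertion.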

Let $x \ge 1$, let $t_0 = \lfloor \frac34 n\log n \rfloor$ and fix $t \le t_0$. Define for each $m \ge 1$,

$$T_m = [n] \cap \left[ \min\left\{\Pi^{\mathrm{id}}_m(j), \Pi^{U^n}_m(j)\right\},\ \max\left\{\Pi^{\mathrm{id}}_m(j), \Pi^{U^n}_m(j)\right\} \right],$$

$$I_m = |T_m \setminus \Lambda_m| \quad \text{and} \quad O_m = \left|[n] \setminus (T_m \cup \Lambda_m)\right|.$$

That is, $I_m$ is the number of positions in the range determined by the
positions of $j$ in both decks and $O_m$ is the number of positions out of this
range, with both sets in addition excluding $\Lambda_m$.

Let $H_m$ denote the event that the card which was removed on the $m$-th
shuffle was in a position in $[n] \setminus \Lambda_{m-1}$ and was not reinserted in a position
which is greater by 1 from a position in $\Lambda_m$ and not reinserted in position
1. Define $\eta(j)$ to be the last time $j$ is removed before or at time $t$, and
$\eta(j) = 0$, if it has not been removed up to $t$. Define

$$\vartheta_x = \min\left\{ m \ge \eta(j) : \left|\Pi^{\mathrm{id}}_m(j) - \Pi^{U^n}_m(j)\right| > x \right\}.$$

Finally, define the process

$$L_m = \sum_{k=\eta(j)+1}^{\min\{\eta(j)+m,\ \vartheta_x\}} \left(I_k - I_{k-1}\right) 1_{H_k},$$

where $1_F$ is the indicator function of a set $F$. The reason for defining $L_m$ is
that, as we shall see, conditioned on $B^t_{i,j}$, its increments satisfy the condition
of Proposition 2.1.
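
Assume $B^t_{i,j}$ occurs. If on the $m$-th shuffle the card numbered $j$ is chosen,
then $\Pi^{\mathrm{id}}_m(j) = \Pi^{U^n}_m(j)$, and thus $I_{\eta(j)} = 0$. Hence, for any time $\eta(j) < s \le t$,

$$I_s = \sum_{m=\eta(j)+1}^{s} \left(I_m - I_{m-1}\right) 1_{H_m} + \sum_{m=\eta(j)+1}^{s} \left(I_m - I_{m-1}\right) 1_{H_m^c}.$$

For any $\eta(j) < m \le t$, $j$ is not removed at time $m$, and therefore, as can be
easily seen, $\left|\left(I_m - I_{m-1}\right) 1_{H_m^c}\right| \le 2$.

Now, still under the assumption that $B^t_{i,j}$ occurs, if $\vartheta_x \le t$, then, since by
definition

$$\left|\, \left|\Pi^{\mathrm{id}}_{\vartheta_x}(j) - \Pi^{U^n}_{\vartheta_x}(j)\right| - I_{\vartheta_x} \right| \le 3,$$

it holds that

$$\max_{0\le m\le t} L_m + 2\sum_{m=1}^{t} 1_{H_m^c} \ge L_{\vartheta_x - \eta(j)} + \sum_{m=\eta(j)+1}^{\vartheta_x} \left(I_m - I_{m-1}\right) 1_{H_m^c} = I_{\vartheta_x} \ge x - 3.$$

Therefore for $0 < u, \delta$ such that $u + \delta \le x - 3$, a union bound gives

$$\text{(5.1)} \qquad P^n_{\mathrm{id},U^n}\left( \left|\Pi^{\mathrm{id}}_t(j) - \Pi^{U^n}_t(j)\right| > x \;\Big|\; B^t_{i,j} \right) \le P^n_{\mathrm{id},U^n}\left( \max_{0\le m\le t} L_m \ge u \;\Big|\; B^t_{i,j} \right) + P^n_{\mathrm{id},U^n}\left( 2\sum_{m=1}^{t} 1_{H_m^c} \ge \delta \;\Big|\; B^t_{i,j} \right).$$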



Next, we apply Proposition 2.1 to $L_m$ conditioned on $B^t_{i,j}$, in order to
bound its exceedance probability in (5.1). For brevity, denote $m' = m + \eta(j)$.
First of all, note that given that $m' + 1 \le \vartheta_x$, $L_{m+1} = L_m + 1$ (similarly,
$L_{m+1} = L_m - 1$) if and only if the position of the card that is chosen for
removal on step $m'+1$ is in $[n] \setminus (T_{m'} \cup \Lambda_{m'})$ (respectively, $T_{m'} \setminus \Lambda_{m'}$), and its
position after reinsertion is greater by 1 from some position in $T_{m'+1} \setminus \Lambda_{m'+1}$
(respectively, $[n] \setminus (T_{m'+1} \cup \Lambda_{m'+1})$). Therefore, for $d = \pm1$, the number of different
insertion shuffles (i.e., choices of a card and a position) such that $L_{m+1} =
L_m + d$ is $I_{m'} O_{m'}$. Similarly, it can be easily seen that the number of different
insertion shuffles such that $H_{m'+1}$ occurs is $\left(I_{m'} + O_{m'}\right)\left(I_{m'} + O_{m'} - 1\right)$.
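
Thus, for $d = \pm1$,

$$\text{(5.2)} \qquad P^n_{\mathrm{id},U^n}\left( L_{m+1} - L_m = d \mid H_{m'+1},\ m'+1 \le \vartheta_x,\ I_{m'} = a,\ O_{m'} = b \right)$$
$$= P^n_{\mathrm{id},U^n}\left( I_{m'+1} - I_{m'} = d \mid I_{m'} = a,\ O_{m'} = b,\ H_{m'+1} \right) = \begin{cases} \dfrac{ab}{(a+b)(a+b-1)} & \text{if } a + b > 1, \\ 0 & \text{if } a + b \le 1. \end{cases}$$

Since the only nonzero increments of $L_m$ are $\pm1$ and $\{L_{m+1} \ne L_m\} \subset
H_{m'+1} \cap \{m'+1 \le \vartheta_x\}$, it follows that

$$P^n_{\mathrm{id},U^n}\left( L_{m+1} = k_m + 1 \mid L_{m+1} \ne L_m,\ \{L_p\}_{p=1}^m = \{k_p\}_{p=1}^m,\ B^t_{i,j} \right) = \frac12.$$

Additionally, if

$$P^n_{\mathrm{id},U^n}\left( B^t_{i,j},\ L_{m+1} \ne L_m,\ \{L_p\}_{p=1}^m = \{k_p\}_{p=1}^m \right) > 0,$$

then

$$P^n_{\mathrm{id},U^n}\left( L_{m+1} \ne L_m \mid B^t_{i,j},\ \{L_p\}_{p=1}^m = \{k_p\}_{p=1}^m \right)$$
$$\le \max P^n_{\mathrm{id},U^n}\left( L_{m+1} \ne L_m \mid B^t_{i,j},\ \{L_p\}_{p=1}^m = \{k_p\}_{p=1}^m,\ H_{m'+1},\ m'+1 \le \vartheta_x,\ I_{m'} = a,\ O_{m'} = b \right),$$

where the maximum is taken over all values of $a$ and $b$ such that the
conditional probability is defined. By definition, if $m' + 1 \le \vartheta_x$, then $I_{m'} < x$. In
addition, for any $m$, $O_m + I_m \ge n - 4$. Hence, from the last inequality and
(5.2) we have (recall we assumed that $n \ge 9$),

$$P^n_{\mathrm{id},U^n}\left( L_{m+1} \ne L_m \mid B^t_{i,j},\ \{L_p\}_{p=1}^m = \{k_p\}_{p=1}^m \right) \le \max_{\substack{0\le a<x \\ a+b\ge n-4}} \frac{2ab}{(a+b)(a+b-1)} \le \frac{2x}{n-4}.$$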
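
By Proposition 2.1 applied to $L_m$ as a process conditioned on $B^t_{i,j}$, we
have, for $0 < u < x$,

$$\text{(5.3)} \qquad P^n_{\mathrm{id},U^n}\left( \max_{0\le m\le t} L_m \ge u \;\Big|\; B^t_{i,j} \right) \le P\left( \max_{0\le m\le t} S^x_m \ge u \right),$$

where $S^x_m$ is a random walk with increment distribution

$$\mu_x(1) = \mu_x(-1) = \frac{x}{n-4}, \qquad \mu_x(0) = 1 - \frac{2x}{n-4}.$$

Recall that $t \le t_0 = \lfloor \frac34 n\log n \rfloor$. By Lévy's and Bernstein's inequalities
(Petrov (1995), Theorem 2.8), for $0 < u < x$,

$$\text{(5.4)} \qquad P\left( \max_{0\le m\le t_0} S^x_m \ge u \right) \le 2P\left( S^x_{t_0} \ge u \right) \le 2\exp\left\{ -\frac{u^2 (n-4)}{8 x t_0} \right\} \le 2\exp\left\{ -\frac{u^2}{12 x \log n} \right\}.$$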



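Now let us bound the other summand of (5.1). For $n \ge 9$, since $|\Lambda_m| \le 4$,
it can be easily seen that

$$\text{(5.5)} \qquad P^n_{\mathrm{id},U^n}\left( 2\sum_{m=1}^{t} 1_{H_m^c} \ge \delta \;\Big|\; B^t_{i,j} \right) \le P\left( Q_t > \frac{\delta}{2} - 2 \right) \le P\left( Q_{t_0} > \frac{\delta}{2} - 2 \right),$$

where $Q_t \sim \mathrm{Bin}\left(t, \frac9n\right)$ and similarly for $t_0$. For $\delta > 27\log n + 4$, Bernstein's
inequality gives

$$\text{(5.6)} \qquad P\left( Q_{t_0} > \frac{\delta}{2} - 2 \right) \le \exp\left\{ -\frac{\delta/2 - 2 - 9t_0/n}{4} \right\} \le \exp\left\{ -\left( \frac{\delta}{8} - \frac12 - \frac{27}{16}\log n \right) \right\}.$$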


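From (5.1) and (5.3)-(5.6), choosing $x = 100\log^2 n$, $u = 50\log^2 n$ and
$\delta = 30\log n + 4$, for example, we obtain

$$P^n_{\mathrm{id},U^n}\left( \left|\Pi^{\mathrm{id}}_t(j) - \Pi^{U^n}_t(j)\right| > 100\log^2 n \;\Big|\; B^t_{i,j} \right) \le 2n^{-25/12} + n^{-33/16} \le 3n^{-2}.$$

□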